Semi-Supervised Instance Population of an Ontology using Word Vector Embeddings

نویسندگان

  • Vindula Jayawardana
  • Dimuthu Lakmal
  • Nisansa de Silva
  • Amal Shehan Perera
  • Keet Sugathadasa
  • Buddhi Ayesha
  • Madhavi Perera
چکیده

In many modern day systems such as information extraction and knowledge management agents, ontologies play a vital role in maintaining the concept hierarchies of the selected domain. However, ontology population has become a problematic process due to its nature of heavy coupling with manual human intervention. With the use of word embeddings in the filed of natural language processing, it became a popular topic due to its ability to cope up with semantic sensitivity. Hence, in this study we propose a novel way of semi-supervised ontology population through word embeddings as the basis. We built several models including traditional benchmark models and new types of models which are based on word embeddings. Finally, we ensemble them together to come up with a synergistic model with better accuracy. We demonstrate that our ensemble model can outperform the individual models. keywords: Ontology, Ontology Population, Word Embeddings, word2vec

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Revisiting Semi-Supervised Learning with Graph Embeddings

We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph. We develop both transductive and inductive variants of our method. In the transductive variant of our method, the class labels are determined by both the learned embedding...

متن کامل

Semi-supervised Bio-named Entity Recognition with Word-Codebook Learning

We describe a novel semi-supervised method called WordCodebook Learning (WCL), and apply it to the task of bionamed entity recognition (bioNER). Typical bioNER systems can be seen as tasks of assigning labels to words in bioliterature text. To improve supervised tagging, WCL learns a class of word-level feature embeddings to capture word semantic meanings or word label patterns from a large unl...

متن کامل

Improved CCG Parsing with Semi-supervised Supertagging

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-theart CCG parser can be enhanced, by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to ...

متن کامل

Improved CCG Parsing with Semi-supervised Supertagging

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-theart CCG parser can be enhanced, by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to ...

متن کامل

Learning Word Embeddings for Hyponymy with Entailment-Based Distributional Semantics

Lexical entailment, such as hyponymy, is a fundamental issue in the semantics of natural language. This paper proposes distributional semantic models which efficiently learn word embeddings for entailment, using a recently-proposed framework for modelling entailment in a vectorspace. These models postulate a latent vector for a pseudo-phrase containing two neighbouring word vectors. We investig...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017